feat: add OpenAI diarization support #651
📝 Walkthrough
Adds end-to-end speaker diarization for OpenAI transcription.
Changes
OpenAI Transcription Diarization Feature
Sequence Diagram
sequenceDiagram
participant Adapter as OpenAI Adapter
participant Validator as validateDiarizationOptions
participant Mapper as mapResponseFormat
participant OpenAI as OpenAI API
participant Parser as Diarized Parser
Adapter->>Adapter: Identify diarization-capable model
Adapter->>Validator: Validate diarization options
Validator-->>Adapter: Constraints enforced
Adapter->>Mapper: Map responseFormat
Mapper-->>Adapter: diarized_json selected or mapped format
Adapter->>OpenAI: Create transcription request (response_format, chunking_strategy)
OpenAI-->>Adapter: Diarized or non-diarized response
Adapter->>Parser: Map segments with speaker labels
Parser-->>Adapter: TranscriptionSegment[]
Adapter-->>Adapter: Return structured transcription result
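The last two steps of the diagram (handing the diarized response to the parser and getting `TranscriptionSegment[]` back) could be sketched roughly as follows. This is an illustration only: the `DiarizedSegment` fields are assumed from the `speaker`/`text`/`start`/`end` layout of a `diarized_json` response, and both the `TranscriptionSegment` fields and the `mapDiarizedSegments` helper are hypothetical names, not the adapter's actual code.

```typescript
// Assumed shape of one segment in a diarized_json response (hypothetical).
interface DiarizedSegment {
  speaker: string
  text: string
  start: number // seconds from the start of the audio
  end: number // seconds from the start of the audio
}

// Hypothetical normalized segment; the real TranscriptionSegment may differ.
interface TranscriptionSegment {
  id: number
  start: number
  end: number
  text: string
  speaker?: string
}

// Map provider segments to the normalized shape, keeping the speaker label.
function mapDiarizedSegments(
  segments: ReadonlyArray<DiarizedSegment>,
): Array<TranscriptionSegment> {
  return segments.map((segment, index) => ({
    id: index,
    start: segment.start,
    end: segment.end,
    text: segment.text.trim(),
    speaker: segment.speaker,
  }))
}

// Usage with made-up sample data.
const mapped = mapDiarizedSegments([
  { speaker: 'A', text: ' Hello there. ', start: 0, end: 1.2 },
  { speaker: 'B', text: 'Hi!', start: 1.3, end: 1.8 },
])
```

Keeping `speaker` optional on the normalized type would let non-diarized responses reuse the same segment shape.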
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/ai-openai/src/adapters/transcription.ts`:
- Around line 267-285: The diarization validation is missing a local guard for
responseFormat: update validateDiarizationOptions (used by transcribe and
guarded by isDiarizeTranscriptionModel) to throw when
modelOptions.responseFormat (or the mapped value from mapResponseFormat) is not
one of the allowed values ["json","text","diarized_json"]; ensure transcribe()
cannot send srt/vtt/verbose_json for diarize models by checking
modelOptions.responseFormat (or resolved response format) early and throwing a
clear error stating diarization models only support json, text, and
diarized_json; reference validateDiarizationOptions, transcribe,
mapResponseFormat, and isDiarizeTranscriptionModel when applying the change.
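A minimal sketch of the guard this comment asks for, assuming the resolved response format is available as a plain string. The helper names `isDiarizeResponseFormat` and `assertDiarizeResponseFormat` are hypothetical, and the real `validateDiarizationOptions` signature may differ; only the allowed values (`json`, `text`, `diarized_json`) come from the comment itself.

```typescript
// Response formats the review comment lists as valid for diarization models.
const DIARIZE_RESPONSE_FORMATS = ['json', 'text', 'diarized_json'] as const
type DiarizeResponseFormat = (typeof DIARIZE_RESPONSE_FORMATS)[number]

// Hypothetical type guard: true only for the three allowed formats.
function isDiarizeResponseFormat(
  value: string,
): value is DiarizeResponseFormat {
  return DIARIZE_RESPONSE_FORMATS.some((format) => format === value)
}

// Hypothetical guard meant to run before any request is sent.
function assertDiarizeResponseFormat(responseFormat: string | undefined): void {
  // Nothing requested: leave the default choice to the caller.
  if (responseFormat === undefined) return
  if (!isDiarizeResponseFormat(responseFormat)) {
    throw new Error(
      `Diarization models only support json, text, and diarized_json response formats (received "${responseFormat}").`,
    )
  }
}

// Usage: allowed formats pass silently, anything else throws.
assertDiarizeResponseFormat('diarized_json')
```

Failing locally like this would surface an unsupported `srt`, `vtt`, or `verbose_json` request as a clear error before it reaches the API.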
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 7c4b4b31-fb90-4e00-9d8f-1454f513e089
📒 Files selected for processing (13)
- .changeset/openai-transcription-diarization.md
- docs/adapters/openai.md
- docs/comparison/vercel-ai-sdk.md
- docs/media/generation-hooks.md
- docs/media/transcription.md
- docs/reference/interfaces/TranscriptionOptions.md
- packages/ai-client/src/generation-types.ts
- packages/ai-openai/src/adapters/transcription.ts
- packages/ai-openai/src/audio/transcription-provider-options.ts
- packages/ai-openai/tests/transcription-adapter.test.ts
- packages/ai/skills/ai-core/media-generation/SKILL.md
- packages/ai/src/activities/generateTranscription/index.ts
- packages/ai/src/types.ts
Actionable comments posted: 0
05dfb53 to fbb57a0
View your CI Pipeline Execution ↗ for commit fbb57a0
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@docs/media/transcription.md`:
- Line 561: The example hardcodes 'whisper-1' in the createOpenaiTranscription
call; update the docs to use the provider's latest transcription model constant
exported from the OpenAI adapter's model-meta.ts instead of a string literal.
Import or reference the exported latest-model symbol from that file (e.g., the
adapter's LATEST_* or DEFAULT_* transcription model constant) and pass that
symbol into createOpenaiTranscription so the docs always use the adapter-defined
current OpenAI transcription model.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: b2c455a0-25d6-4921-8f26-77965d2791be
📒 Files selected for processing (6)
- .changeset/openai-transcription-diarization.md
- docs/adapters/openai.md
- docs/comparison/vercel-ai-sdk.md
- docs/media/generation-hooks.md
- docs/media/transcription.md
- docs/reference/interfaces/TranscriptionOptions.md
✅ Files skipped from review due to trivial changes (5)
- .changeset/openai-transcription-diarization.md
- docs/media/generation-hooks.md
- docs/comparison/vercel-ai-sdk.md
- docs/adapters/openai.md
- docs/reference/interfaces/TranscriptionOptions.md
Caution
Inline review comments failed to post. This is likely due to GitHub's internal server error or limits when posting large numbers of comments. If you are seeing this consistently it is likely a permissions issue. Please check "Moderation" -> "Code review limits" under your organization settings.
🛑 Comments failed to post (1)
docs/media/transcription.md (1)
Line 561:
⚠️ Potential issue | 🟡 Minor | ⚡ Quick win
Use the provider’s latest OpenAI transcription model in this example.
This changed snippet still hardcodes `whisper-1`; please update it to the latest OpenAI transcription model defined in the adapter `model-meta.ts` to keep docs aligned with project policy.
As per coding guidelines: “Use the latest model per provider in documentation example code, sourced from each adapter's `model-meta.ts` (newest `gpt-*`, `claude-*`, `gemini-*`, …)”.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/media/transcription.md` at line 561: the example hardcodes 'whisper-1' in the createOpenaiTranscription call; update the docs to use the provider's latest transcription model constant exported from the OpenAI adapter's model-meta.ts instead of a string literal. Import or reference the exported latest-model symbol from that file (e.g., the adapter's LATEST_* or DEFAULT_* transcription model constant) and pass that symbol into createOpenaiTranscription so the docs always use the adapter-defined current OpenAI transcription model.
Hi @8times4, thank you for this. Would you be able to create an e2e test for this using aimock? The tests are in the e2e test package. Ideally, adding a way to see the results on one of the ts-react-chat example pages would be great as well.
Code review
Found 3 issues:
- ai/packages/ai-openai/src/adapters/transcription.ts Lines 140 to 150 in 05dfb53
- Lines 1723 to 1732 in 05dfb53
- ai/packages/ai-openai/src/adapters/transcription.ts Lines 339 to 370 in 05dfb53
🤖 Generated with Claude Code - If this code review was useful, please react with 👍. Otherwise, react with 👎.
🎯 Changes
This change adds diarization support for OpenAI's gpt-4o-transcribe-diarize model, based on https://developers.openai.com/api/docs/guides/speech-to-text?lang=javascript
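As a rough illustration of what such a request might carry, the sketch below builds the parameters named in this PR (`response_format`, `chunking_strategy`) for the `gpt-4o-transcribe-diarize` model. The `DiarizedTranscriptionParams` type and `buildDiarizedParams` helper are hypothetical, the audio file is left out, and the `'auto'` chunking value is an assumption based on the linked guide rather than something this PR states.

```typescript
// Hypothetical parameter shape; field names mirror the snake_case request
// parameters mentioned in this PR. The audio file itself is omitted here.
interface DiarizedTranscriptionParams {
  model: 'gpt-4o-transcribe-diarize'
  response_format: 'diarized_json' | 'json' | 'text'
  chunking_strategy: 'auto'
}

// Build request parameters, defaulting to speaker-labeled output.
function buildDiarizedParams(
  responseFormat: DiarizedTranscriptionParams['response_format'] = 'diarized_json',
): DiarizedTranscriptionParams {
  return {
    model: 'gpt-4o-transcribe-diarize',
    response_format: responseFormat,
    // Assumption: 'auto' lets the API choose chunk boundaries itself.
    chunking_strategy: 'auto',
  }
}

// Usage: the default asks for diarized_json; plain text is still possible.
const params = buildDiarizedParams()
const textParams = buildDiarizedParams('text')
```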
✅ Checklist
pnpm run test:pr.
🚀 Release Impact
Summary by CodeRabbit
New Features
- `diarized_json` response format with speaker-labeled segments
Documentation
Tests